National Repository of Grey Literature: 3 records found
Evaluation of Fuzzy Clustering Results (Hodnocení Výsledků Fuzzy Shlukování)
Říhová, Elena ; Pecáková, Iva (advisor) ; Řezanková, Hana (referee) ; Žambochová, Marta (referee)
Cluster analysis is a multivariate statistical classification method that comprises a range of methods and procedures. Clustering methods can be divided into hard and fuzzy; the latter provide a more precise picture of the information obtained by clustering objects than hard clustering does. In practice, however, the optimal number of clusters is not known a priori and must be determined. Validity indices help solve this problem, but there are many to choose from. One goal of this work is to create a structured overview of existing validity indices and techniques for evaluating fuzzy clustering results in order to find the optimal number of clusters. The main aim was to propose a new index for evaluating fuzzy clustering results, especially in cases with a large number of clusters (defined as more than five). The newly designed coefficient is based on the degrees of membership and on the Euclidean distance between objects, i.e. on principles from both fuzzy and hard clustering. The suitability of selected validity indices was assessed on real and generated data sets for which the optimal number of clusters is known a priori. These data sets have different sizes, different numbers of variables, and different numbers of clusters. The aims of the work are regarded as fulfilled. A key contribution is the new coefficient (E), which is appropriate for evaluating situations with both large and small numbers of clusters. Because the new validity index is based on the principles of both fuzzy and hard clustering, it is able to correctly determine the optimal number of clusters on both small and large data sets. A second contribution is a structured overview of existing validity indices and techniques for evaluating fuzzy clustering results.
Evaluation of the Success of Coefficients and Methods Used in Cluster Analysis
Hammerbauer, Jiří ; Löster, Tomáš (advisor) ; Makhalova, Elena (referee)
The diploma thesis evaluates the success of selected indices for determining the number of clusters in cluster analysis. Its aim is to verify, on the basis of various combinations of clustering methods and distances, whether, and with which clustering methods and distances, the results of these indices can be relied upon. The success rates presented in the third chapter suggest that not all indices for determining the number of clusters can be used universally. The most successful index is the Dunn index, which determined the correct number of clusters in 37% of cases; when a deviation of one cluster is tolerated, it is the Davies-Bouldin index, with a share of 70%. The success rate is affected by both the clustering method used and the distance selected.
Evaluation of Cluster Analysis Methods
Löster, Tomáš ; Řezanková, Hana (advisor) ; Berka, Petr (referee) ; Dohnal, Gejza (referee)
Cluster analysis includes a range of methods and procedures that are used primarily for the classification of objects. It plays an important role in many areas. Since the resulting assignment of objects to clusters may vary depending on the selected methods and their specifications, it is appropriate to assess the results obtained. This work proposes new ways of evaluating these results in situations where objects are characterized by qualitative variables or by variables of different types. The proposed coefficients can be used either to compare different methods (in terms of better outcomes) or to find the optimal number of clusters. All of them are based on measuring variability, which is also used to measure the dissimilarity of objects and clusters. The newly proposed evaluation methods are applied to real data sets (of different sizes, with different numbers of variables, including variables of different types), and the behavior of the coefficients under different conditions is examined. The data sets include both those with a known classification of objects into clusters and those without. Based on the analysis carried out, the modified CHF coefficient can be considered the best coefficient for evaluating clustering results with variables of different types. A local maximum, according to which the clustering results are evaluated, almost always exists. The analysis showed that in most cases this value agrees with the known classification of objects into clusters. The existence of local extremes of the other coefficients depends on the specific data set and is not guaranteed.
